-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: always include the Invariants writerFeature when upgrading to v7 #2957
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2957 +/- ##
==========================================
+ Coverage 72.35% 72.37% +0.02%
==========================================
Files 131 131
Lines 40602 40644 +42
Branches 40602 40644 +42
==========================================
+ Hits 29376 29416 +40
- Misses 9336 9339 +3
+ Partials 1890 1889 -1 ☔ View full report in Codecov by Sentry. |
The presence of a timestampntz column in CreateBuilder results in the reader/writer being set to 3/7. The protocol reaquires that if the table is writer v7, that `invariants` "must exist in the table protocl's writerFeatures. Sponsored-by: Scribd Inc Signed-off-by: R. Tyler Croy <[email protected]>
4408dd7
to
4a881ef
Compare
@rtyler to my understanding it's not required, but it's assumed when a table has NOT NULL columns |
See discussion here: #2882 (comment) |
I added this because of the error and this statement in the protocol
|
@ion-elgreco I should also mention that this came from a table created by delta-rs and does not contain any actual invariants. I believe newer DBRs and Delta/Spark releases are more strictly adhering to the protocol and expect invariants to be defined on writerVersion 7 The phrasing of the protocol is lousy here. append only, check constraints, and a few other features have similar "to support this feature it must be in the writerFeatures verbiage. But as best as I can tell Delta/Spark is only being pedantic about
|
It's a bit poorly written, but the sentence starts with "To support this feature". I also misread it before and assumed it was required but it's just to support it in v7 we need to add the feature. Did you check when you create a v3,7 table with nullable columns in spark what the protocol log states? |
Okay so I did some testing with PySpark and I am going to pull this into draft again to modify the changes a bit. Essentially if a non-nullable column exists and things are upgraded to writer7, the This behavior is not specified anywhere in the protocol which is fun 💀 DeltaTable.create(spark).tableName('nonulliz').addColumn('ts', 'TIMESTAMP_NTZ', nullable=False).location('./nonulliz').execute() {
"commitInfo": {
"timestamp": 1729706895482,
"operation": "CREATE TABLE",
"operationParameters": {
"partitionBy": "[]",
"clusterBy": "[]",
"description": null,
"isManaged": "false",
"properties": "{}"
},
"isolationLevel": "Serializable",
"isBlindAppend": true,
"operationMetrics": {},
"engineInfo": "Apache-Spark/3.5.3 Delta-Lake/3.2.0",
"txnId": "1e0e57c1-d1a2-41d9-9402-53516b8f4c77"
}
}
{
"metaData": {
"id": "721d696f-8396-4a69-851c-23702b8b61bb",
"format": {
"provider": "parquet",
"options": {}
},
"schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"ts\",\"type\":\"timestamp_ntz\",\"nullable\":false,\"metadata\":{}}]}",
"partitionColumns": [],
"configuration": {},
"createdTime": 1729706895473
}
}
{
"protocol": {
"minReaderVersion": 3,
"minWriterVersion": 7,
"readerFeatures": [
"timestampNtz"
],
"writerFeatures": [
"timestampNtz",
"invariants"
]
}
} |
The presence of a timestampntz column in CreateBuilder results in the reader/writer being set to 3/7. The protocol reaquires that if the table is writer v7, that
invariants
"must exist in the table protocl's writerFeatures.Sidenote: Yes I know that we cannot actually support invariant utilization in delta-rs, but creating a table with
timestampntz
will (rightfully) be invalid and unreadable by Databricks and other compliant Delta readers:Sponsored-by: Scribd Inc